16 research outputs found

    In silico microdissection of microarray data from heterogeneous cell populations

    BACKGROUND: Very few analytical approaches have been reported to resolve the variability in microarray measurements stemming from sample heterogeneity. For example, tissue samples used in cancer studies are usually contaminated with the surrounding or infiltrating cell types. This heterogeneity in the sample preparation hinders further statistical analysis, significantly so if different samples contain different proportions of these cell types. Thus, sample heterogeneity can result in the identification of differentially expressed genes that may be unrelated to the biological question being studied. Similarly, irrelevant gene combinations can be discovered in the case of gene expression based classification. RESULTS: We propose a computational framework for removing the effects of sample heterogeneity by "microdissecting" microarray data in silico. The computational method provides estimates of the expression values of the pure (non-heterogeneous) cell samples. The inversion of the sample heterogeneity can be facilitated by providing accurate estimates of the mixing percentages of different cell types in each measurement. For those cases where no such information is available, we develop an optimization-based method for joint estimation of the mixing percentages and the expression values of the pure cell samples. We also consider the problem of selecting the correct number of cell types. CONCLUSION: The efficiency of the proposed methods is illustrated by applying them to carefully controlled cDNA microarray data obtained from heterogeneous samples. The results demonstrate that the methods are capable of reconstructing both the sample-specific and cell-type-specific expression values from heterogeneous mixtures and that the mixing percentages of different cell types can also be estimated. Furthermore, a general purpose model selection method can be used to select the correct number of cell types.
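
    The "inversion of the sample heterogeneity" with known mixing percentages can be illustrated with a minimal sketch. It assumes a linear mixing model (each measurement is a fraction-weighted sum of the pure cell-type profiles) and ordinary least squares; the abstract does not spell out the model, so both are assumptions, and the function name and toy data are hypothetical rather than taken from the paper.

```python
import numpy as np

def estimate_pure_profiles(mixing, measured):
    """Least-squares estimate of pure cell-type expression profiles.

    Assumed model (not stated in the abstract): measured = mixing @ pure + noise.
    mixing:   (n_samples, n_cell_types), each row holds the cell-type fractions
              of one heterogeneous sample.
    measured: (n_samples, n_genes) expression measurements.
    Returns an (n_cell_types, n_genes) array of estimated pure profiles.
    """
    mixing = np.asarray(mixing, dtype=float)
    measured = np.asarray(measured, dtype=float)
    if mixing.shape[0] < mixing.shape[1]:
        raise ValueError("need at least as many samples as cell types")
    # Solve the overdetermined linear system for all genes at once.
    pure, _, _, _ = np.linalg.lstsq(mixing, measured, rcond=None)
    return pure

# Toy data: four samples mixing two cell types, three genes, no noise.
mixing = np.array([[1.0, 0.0],
                   [0.7, 0.3],
                   [0.4, 0.6],
                   [0.1, 0.9]])
true_pure = np.array([[2.0, 0.5, 1.0],
                      [0.2, 1.5, 1.0]])
measured = mixing @ true_pure
estimated = estimate_pure_profiles(mixing, measured)
```

    In this noise-free toy case the estimate reproduces the pure profiles. The joint estimation of unknown mixing percentages mentioned in the abstract is a harder optimization problem that this sketch does not attempt.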

    Robust regression for periodicity detection in non-uniformly sampled time-course gene expression data

    BACKGROUND: In practice, many biological time series measurements, including gene microarrays, are taken at time points that the biologist considers interesting, not necessarily at fixed time intervals. In many circumstances we are interested in finding targets that are expressed periodically. To tackle the problems of uneven sampling and unknown type of noise in periodicity detection, we propose to use robust regression. METHODS: The aim of this paper is to develop a general framework for robust periodicity detection and to review and rank different approaches by means of simulations. We also show results for some real measurement data. RESULTS: The simulation results clearly show that as the sampling of the time series becomes more uneven, the methods that assume even sampling become unusable. We find that M-estimation provides a good compromise between robustness and computational efficiency. CONCLUSION: Since uneven sampling occurs often in biological measurements, the robust methods developed in this paper are expected to have many uses. The regression-based formulation of the periodicity detection problem easily adapts to non-uniform sampling. Using robust regression helps to reject inconsistently behaving data points. AVAILABILITY: The implementations are currently available for Matlab and will be made available for users of R as well. More information can be found in the web supplement [1].
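
    The regression-based formulation can be sketched as fitting a sinusoid at a candidate frequency directly to the unevenly spaced time points, with an M-estimator in place of ordinary least squares. The example below is an illustration only: the Huber loss, the MAD scale estimate, the iteratively reweighted least squares loop, and every name and data value are assumptions, not the authors' Matlab implementation.

```python
import numpy as np

def fit_sinusoid_huber(t, y, freq, delta=1.345, n_iter=50):
    """Fit y ~ a*cos(2*pi*freq*t) + b*sin(2*pi*freq*t) + c.

    Returns (robust_beta, ols_beta), each holding [a, b, c]. The robust fit
    is a Huber M-estimate computed by iteratively reweighted least squares.
    The time points t may be unevenly spaced.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.cos(2 * np.pi * freq * t),
                         np.sin(2 * np.pi * freq * t),
                         np.ones_like(t)])
    ols_beta = np.linalg.lstsq(X, y, rcond=None)[0]
    beta = ols_beta.copy()
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust residual scale from the median absolute deviation.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, delta/u for large ones.
        w = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta, ols_beta

# Toy data: uneven sampling over 48 hours, a 24-hour cycle of amplitude 2,
# and two gross outliers.
t = np.array([0, 1, 3, 4, 7, 8, 12, 13, 15, 18, 20, 21,
              25, 26, 30, 31, 33, 36, 38, 41, 43, 44, 47], dtype=float)
y = 2.0 * np.cos(2 * np.pi * t / 24.0)
y[[0, 12]] += 8.0

robust_beta, ols_beta = fit_sinusoid_huber(t, y, freq=1.0 / 24.0)
robust_amp = np.hypot(robust_beta[0], robust_beta[1])
ols_amp = np.hypot(ols_beta[0], ols_beta[1])
```

    Here the amplitude from the robust fit stays close to the true value of 2, while the ordinary least-squares amplitude is pulled away by the two outliers. Repeating the fit over a grid of candidate frequencies would give a spectrum-like estimate.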

    Perspective on Oncogenic Processes at the End of the Beginning of Cancer Genomics.

    The Cancer Genome Atlas (TCGA) has catalyzed systematic characterization of diverse genomic alterations underlying human cancers. At this historic junction marking the completion of genomic characterization of over 11,000 tumors from 33 cancer types, we present our current understanding of the molecular processes governing oncogenesis. We illustrate our insights into cancer through synthesis of the findings of the TCGA PanCancer Atlas project on three facets of oncogenesis: (1) somatic driver mutations, germline pathogenic variants, and their interactions in the tumor; (2) the influence of the tumor genome and epigenome on transcriptome and proteome; and (3) the relationship between tumor and the microenvironment, including implications for drugs targeting driver events and immunotherapies. These results will anchor future characterization of rare and common tumor types, primary and relapsed tumors, and cancers across ancestry groups and will guide the deployment of clinical genomic sequencing

    Microdissection of microarray data from heterogeneous cell populations-1

    No full text
    Copyright information: Taken from "microdissection of microarray data from heterogeneous cell populations". BMC Bioinformatics 2005;6:54. Published online 14 Mar 2005. PMCID: PMC1274251. Copyright © 2005 Lähdesmäki et al; licensee BioMed Central Ltd.
    les #2, #3, and #4 are used to estimate the expression profiles of the pure colon cancer cells and lymphocytes. The height of each bar corresponds to the value of the most significant PCA component. Each bar corresponds to a heterogeneous sample, reference sample, or estimated expression profile and is labelled with the corresponding text

    Robust regression for periodicity detection in non-uniformly sampled time-course gene expression data-1

    No full text
    Copyright information: Taken from "Robust regression for periodicity detection in non-uniformly sampled time-course gene expression data" (http://www.biomedcentral.com/1471-2105/8/233). BMC Bioinformatics 2007;8:233. Published online 2 Jul 2007. PMCID: PMC1934414.
    s time in hours and the first time point corresponds to 8:40 am. The approximately 24-hour cycle can be seen well. The figure legends show the gene names corresponding to the plotted time series

    Microdissection of microarray data from heterogeneous cell populations-4

    No full text
    les #2, #3, and #4 are used to estimate the expression profiles of the pure colon cancer cells and lymphocytes. Each bar corresponds to a heterogeneous sample, reference sample, or estimated expression profile and is labelled with the corresponding text. The height of each bar corresponds to the value of the most significant PCA component

    Robust regression for periodicity detection in non-uniformly sampled time-course gene expression data-0

    No full text
    led according to the experimental mussel data. The sampling of the second time series (c) is an artificially deteriorated version of the first one. The corresponding spectral estimates, (b) and (d), include the ideal periodogram (Ideal periodogram), as if the time series was sampled uniformly and had no added noise, the periodogram of the samples (Periodogram), ignoring time indices, and the M-estimate (Robust (M) estimator)

    Microdissection of microarray data from heterogeneous cell populations-7

    No full text
    wn in Equation (4)

    Microdissection of microarray data from heterogeneous cell populations-5

    No full text
    de cells and the normalized expression value, respectively. Symbols: the measured expression values (blue circles), the estimated expression values of the pure cell types (red stars), regression-based confidence intervals (red points), and bootstrap-based confidence intervals (red x-marks)

    Microdissection of microarray data from heterogeneous cell populations-6

    No full text
    xpressed based on the heterogeneous measurements (samples #2 and #4, blue circles). After the inversion of the mixing effect, however, the expression difference between the estimated pure colon cancer cells and lymphocytes (red stars) meets an even more stringent criterion of differential expression. The horizontal and vertical axes correspond to the fraction of lymph node cells and the normalized expression value, respectively. Symbols: the heterogeneous samples (blue circles), the estimated expression values (red stars), and the measured expression values of the pure colon cancer cells (blue squares). See text for more details